Fast Dimension-based Partitioning and Merging clustering algorithm
Abstract
Clustering multi-dense, large-scale, high-dimensional numeric datasets is a challenging task due to the high time complexity of most clustering algorithms. Modern data collection tools produce large amounts of data, so fast algorithms are a vital requirement for clustering such data. In this paper, a fast clustering algorithm, called Dimension-based Partitioning and Merging (DPM), is proposed. In DPM, the data is first partitioned into small dense volumes by processing the dataset dimensions successively. Then, noise is filtered out using the dimensional densities of the generated partitions. Finally, a merging process is invoked to construct clusters based on the boundary data samples of the partitions. DPM automatically detects the number of clusters based on three insensitive tuning parameters, which reduces the burden of using it. Performance evaluation of the proposed algorithm on different datasets shows its speed and accuracy compared to competing clustering algorithms. © 2015 Elsevier B.V. All rights reserved.
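The three stages named in the abstract (partitioning along successive dimensions, noise filtering, merging) can be illustrated with a rough sketch. This is not the authors' DPM algorithm, whose details are not given here. It is a simplified stand-in under stated assumptions: partitions are split wherever consecutive sorted values along a dimension differ by more than `gap`, noise is approximated by dropping partitions with fewer than `min_points` samples, and merging uses bounding-box proximity within `merge_dist` in place of the paper's boundary samples. All function and parameter names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]
Box = List[Tuple[float, float]]


def split_on_gaps(points: List[Point], dim: int, gap: float) -> List[List[Point]]:
    """Split points wherever consecutive sorted values along `dim` differ by more than `gap`."""
    ordered = sorted(points, key=lambda p: p[dim])
    groups: List[List[Point]] = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur[dim] - prev[dim] > gap:
            groups.append([])  # a large gap starts a new partition
        groups[-1].append(cur)
    return groups


def partition(points: List[Point], gap: float) -> List[List[Point]]:
    """Process the dimensions one after another, refining the partitions each time."""
    parts = [list(points)]
    for dim in range(len(points[0])):
        parts = [g for part in parts for g in split_on_gaps(part, dim, gap)]
    return parts


def bounding_box(part: List[Point]) -> Box:
    """Per-dimension (min, max) interval of a partition."""
    return [(min(p[d] for p in part), max(p[d] for p in part))
            for d in range(len(part[0]))]


def boxes_close(a: Box, b: Box, merge_dist: float) -> bool:
    """True if the two boxes are within `merge_dist` of each other in every dimension."""
    return all(max(lo_a - hi_b, lo_b - hi_a) <= merge_dist
               for (lo_a, hi_a), (lo_b, hi_b) in zip(a, b))


def dpm_sketch(points: List[Point], gap: float, min_points: int,
               merge_dist: float) -> List[List[Point]]:
    """Partition, drop sparse partitions as noise, then merge nearby partitions."""
    if not points:
        return []
    # Assumption: a partition with too few samples is treated as noise.
    parts = [p for p in partition(points, gap) if len(p) >= min_points]
    boxes = [bounding_box(p) for p in parts]

    # Union-find over partition indices to merge neighbouring partitions.
    parent = list(range(len(parts)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(parts)):
        for j in range(i + 1, len(parts)):
            if boxes_close(boxes[i], boxes[j], merge_dist):
                parent[find(i)] = find(j)

    clusters: Dict[int, List[Point]] = {}
    for i, part in enumerate(parts):
        clusters.setdefault(find(i), []).extend(part)
    return list(clusters.values())
```

Calling `dpm_sketch(points, gap, min_points, merge_dist)` returns a list of clusters, so the number of clusters is a result of the procedure and not an input. The sketch makes only a single pass over the dimensions and compares all pairs of partitions when merging, so it makes no claim to the speed reported for DPM.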
Related resources
Partitioning of Public Transit Networks
Partitioning of public transit networks is a useful approach to speed up the query times or the preprocessing phase of public transit routing algorithms. The aim of partitioning of public transit networks is to find small partitions of the stations of a given network such that most of the traffic lies in their inside while only little traffic goes between them. To find such partitions of the pu...
Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...
Map-merging in Multi-robot Simultaneous Localization and Mapping Process Using Two Heterogeneous Ground Robots
In this article, a fast and reliable map-merging algorithm is proposed to produce a global two dimensional map of an indoor environment in a multi-robot simultaneous localization and mapping (SLAM) process. In SLAM process, to find its way in this environment, a robot should be able to determine its position relative to a map formed from its observations. To solve this complex problem, simultan...
Fast Non-Linear Dimension Reduction
We present a fast algorithm for non-linear dimension reduction. The algorithm builds a local linear model of the data by merging PCA with clustering based on a new distortion measure. Experiments with speech and image data indicate that the local linear algorithm produces encodings with lower distortion than those built by five layer auto-associative networks. The local linear algorithm is also...
A partition-based algorithm for clustering large-scale software systems
Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...
Journal title:
- Appl. Soft Comput.
Volume 36, Issue
Pages -
Publication date 2015